Reliability Engineering and System Safety 201 (2020) 106964 










Reliability Engineering and System Safety 


journal homepage: www.elsevier.com/locate/ress 


Contents lists available at ScienceDirect








Data driven approach to risk management and decision support for dynamic positioning systems


Tarannom Parhizkar*, Sandra Hogenboom, Jan Erik Vinnem, Ingrid Bouwer Utne


Department of Marine Technology, Norwegian University of Science and Technology (NTNU), Trondheim, Norway


ARTICLE INFO 





Keywords: 

Risk management model 
Risk-informed decision-making
Data driven model 

SPAR-H method 

Dynamic positioning (DP) system 
Decision support tool 


ABSTRACT 





Offshore oil and gas operations are inherently associated with risk and may have catastrophic consequences to 
life, property, and environment. Risk management is thus performed during the design, planning, and operation 
phases to control risk. Operational risk models are only periodically updated and do not always reflect the 
available real-time data. This is also the case for dynamic positioning (DP) operation. Monitoring the risk levels 
of the system during the operational phase could reduce the accident risk by providing additional decision 
support information for operators. 

In this paper, a framework for the risk management of DP operation is proposed to assist operators in
decision-making. The risk management output will provide operators with a real-time risk status and
pre-warnings of possible deviations in the system. This framework is developed to support the
decision-making process of the operators by providing the failure probability of alternative decision
scenarios, and it can be applied to any other engineering system and operation.

In order to validate the effectiveness of the framework, DP drilling operations are considered as a case study. 
The results demonstrate the value and effectiveness of the framework, which reduces the risk level of operations 
by contributing to the risk-informed decision-making of operators. 


1. Introduction 


Over the past few decades, online risk-informed decision-making
has played an increasingly important role in aerospace, nuclear,
and marine technologies [1,2]. The frequency of dynamic positioning
(DP) system failure is an ever-increasing problem, as reported in
[3]. It is therefore necessary to further identify fundamental techniques
for improving the DP system to reduce its failure frequency.

The safety improvements resulting from online risk management
applications lead to advanced improvements in system operation [4,5].
For this reason, online risk monitoring and risk management are
necessary to reflect system changes and enhance the understanding of the
current safety state of a system.

The online risk management framework updates the failure probability
of the system as necessary to account for the changes in system
design and operation, thus improving system comprehension [6]. In
contrast to conventional risk management methods, online risk management
is presented as a dynamic development process. Generally,
online risk management is employed to reflect the real-time risk of the
system, thus indicating the actual status of (sub)systems and
operational/environmental conditions.

Moreover, during critical situations, the online decision support and 
alarm system may prevent critical unwanted events or provide earlier 
situation awareness and increased response time to allow for early 
manual intervention [7]. The online risk level of the system can be used 
as an input to a decision support tool in order to aid operators in 
making better decisions more expeditiously and efficiently. As shown in 
Fig. 1, the action result returns to the risk management model, and the 
updated risk value is calculated in real time; this is a continuous 
iterative process. 

Risk-informed decision-making has been applied for many years in
different fields, and the role of risk insights in safety-related
decision-making has received considerable attention. For instance, in [8], an
overall methodology for risk-informed decision-making is proposed. In
[9], the necessity of structural repair of aging naval ships is investigated
based on risk-informed decision-making. In [10], a value-risk graph
that visualizes the risk level of alternative decisions in a manufacturing
process is proposed.


[Figure 1: loop between the online risk model and the decision support tool. Labels: Risk values;
Critical components and situations; Frequencies; Decision scenarios; Prioritized action scenarios; Action.]


Fig. 1. Risk-informed decision-making process.


Abbreviations: BN, Bayesian network; DP, dynamic positioning; DPO, dynamic positioning operator; ESD, event sequence diagrams; FPSO, floating production
storage and offloading; HAZOP, hazard and operability analysis; HEP, human error probability; HFEs, human failure events; HMI, human-machine interface; HRA,
human reliability analysis; IEs, initial events; IMCA, International Marine Contractors Association; LOP, loss of position; MLD, master logic diagram; MODU, mobile
offshore drilling unit; MRP, marine riser package; NCS, Norwegian continental shelf; OIM, offshore installation manager; PRA, probabilistic risk assessment; PSFs,
performance-shaping factors; SPAR-H, standardized plant analysis risk-human reliability analysis; WSOG, well specific operating guideline

* Corresponding author at: Department of Marine Technology, NTNU, 7491 Trondheim, Norway.

E-mail address: tarannom.parhizkar@ntnu.no (T. Parhizkar).

https://doi.org/10.1016/j.ress.2020.106964

Received 6 June 2019; Received in revised form 28 January 2020; Accepted 31 March 2020

0951-8320/ © 2020 Elsevier Ltd. All rights reserved.

According to a literature review, however, research works related to
risk-informed decision-making in dynamic positioning systems are
limited [11,12]. In [7], the importance of decision support in reducing
the shuttle tanker collision risk in floating production storage and
offloading is investigated. In their study, the hazard, barriers, and risk
reduction potentials are assessed, and the necessity of considering
advanced estimation and data assimilation in risk management is
discussed.

In this study, a novel framework that facilitates decision-making in
DP systems is proposed. The main part of the risk-informed
decision-making framework is the risk management model. The risk analysis of
the DP system has been studied in detail at different complexity levels.
In [13], a conceptual model for risk analysis is proposed. In most
research works, the power structure of DP systems is investigated in detail
[14,15]. Studies related to the risk and reliability analyses of overall DP
systems are limited. In [16], a general fault tree for DP classes 1, 2, and
3 is presented. This study indicates that fault trees can be employed for
reliability-based design and maintenance scheduling of multi-megawatt
capacity DP systems. The human and organizational factors, some of
which are important in incident occurrence, however, are ignored [17].
In this study, these factors are considered in the risk management
model. As a result, the model supports the decision-making process while
accounting for human and organizational factors.

The proposed risk management model quantifies the failure
probabilities of different operating scenarios of DP systems and can be used
as a basis for developing a decision support tool. One of the main parts
in system failure quantification includes component failure frequencies.
The sources and research on DP system failure mode quantification are
also limited. In this study, failure modes are identified, and the rates of
their occurrence are quantified based on the International Marine
Contractors Association (IMCA) annual incident reports on DP systems
(2004-2015). The data gathered from these reports are filtered (missing
and inaccurate data are removed), and the failure frequencies of a
generic DP drilling system are presented. Although incomplete, these
data provide a comprehensive insight into the failure modes of DP
systems. As a result, apart from the failure mode frequencies, the
detailed fault trees and risk management model are proposed for DP
systems. It should be noted that this dataset is used as a basis for the risk
level calculation, and the failure frequencies are updated based on the
information gathered over time, as presented in Fig. 1.
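
As an illustration of how a failure frequency could be updated as information accumulates, the
following minimal sketch uses a conjugate gamma-Poisson update. This is an assumption made for
illustration only; the paper does not state which updating rule is used, and the class name
GammaRate, the prior, and the observation counts are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GammaRate:
    """Gamma(shape, rate) belief about a failure frequency (events per vessel-year)."""

    shape: float  # pseudo-count of observed failures
    rate: float   # pseudo-exposure in vessel-years

    @property
    def mean(self) -> float:
        # Expected failure frequency under the current belief.
        return self.shape / self.rate

    def update(self, failures: int, exposure_years: float) -> "GammaRate":
        """Conjugate update, assuming Poisson-distributed failure counts."""
        if failures < 0 or exposure_years < 0:
            raise ValueError("failures and exposure must be non-negative")
        return GammaRate(self.shape + failures, self.rate + exposure_years)


# Hypothetical prior: equivalent to 2 failures over 40 vessel-years of generic data.
prior = GammaRate(shape=2.0, rate=40.0)
# Hypothetical new evidence: 1 failure during 10 further vessel-years.
posterior = prior.update(failures=1, exposure_years=10.0)
print(f"prior mean = {prior.mean:.3f}, posterior mean = {posterior.mean:.3f}")
```

With such a scheme, a generic dataset would act as the prior and vessel-specific observations
would shift the estimate gradually rather than replace it.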

The online risk management model should satisfy two basic
requirements to be applicable in a decision support tool [18]. First, risk
level updating should reflect the information on real system
configuration; second, a rapid solution to support the real-time application of
risk management and decision-making is necessary. The feasibility of
online risk management and decision support tool therefore
considerably depends on the conversion and calculation times of the
solution methodology. It is thus important to develop a highly efficient
calculation engine to enable a rapid solution of the online risk
management model.

Probabilistic risk assessment (PRA) is a systematic and
comprehensive methodology for evaluating risks associated with a complex
system while considering the uncertainties of operational and
environmental conditions [19]. In the present study, an efficient solution
approach based on the PRA method is proposed. Any significant
changes in the system risk level can therefore be perceived by the
operator, providing a basis for risk-informed decision-making. The full
system description and boundaries utilized in this study are provided in
Section 3.1. The main contributions of this study are summarized as
follows.


• A comprehensive risk management framework for DP operations is
developed.

• Human and organizational factors are considered in the risk
management model.

• Failure frequencies are calculated according to IMCA reports from
2004 to 2015.


The paper is organized as follows. A brief overview of the developed
framework is introduced in Section 2. The details of the framework are
presented in Section 3. In Section 4, a DP drilling unit is presented as a
case study and results are presented. In Section 5, the results are
analyzed, and the usefulness and drawbacks of the proposed framework are
discussed. Finally, conclusions and contributions are presented in
Section 6.


2. General concept of risk-informed decision making 


A conceptual framework of risk-informed decision-making is
presented in Fig. 2.

As presented in Fig. 2, the first step is data collection. Some data are
non-observable; hence, these data cannot be considered in the
modeling. These data are the main source of model uncertainty, and this
limitation is further presented in the Discussion section (Section 5.4,
Model uncertainties). Operators' beliefs and desires in controlling the
system are examples of non-observable data.

Observable data can be considered in the modeling process and are
categorized as online or offline data. Offline data generally include
design parameters and system characteristics, because they could be
used as model input only at a specific instance. For example, the size of
machineries, system dimensions, and material characteristics of
components could be considered as offline data. Online data, however,
serve as continuous model input. Operating and environmental
conditions are examples of online data, which are updated through time.
Observable data, including offline and online data, are collected and
monitored for further analysis in the first step of the framework.

In the next step, the failure probability of the system is calculated.
The failure probability and risk level are quantified by analyzing the
system fault tree, event tree, and Bayesian network models. The
methodology of risk management is discussed in further detail in
succeeding sections. The last step includes decision-making; in this step,
the results of the risk management model are analyzed and presented to
support system operators in decision-making.


3. Risk management model 


An overview of the proposed risk management methodology is
presented in this section. Fig. 3 illustrates the flow diagram of
sub-models of the proposed methodology. In the first step, the boundary of
the system should be defined. Based on the system boundary, the
preferred end states can be determined. Thereafter, initial events that can
lead to the realization of the target end state are determined. In the next
step, event sequence diagrams (ESD), which can aid in deriving event
trees, are constructed. Finally, fault trees, Bayesian networks, and
decision support models are developed.


[Figure 2: Labels: Data collection; Non-observable data; Observable data (offline data, online data);
Condition monitoring; Environmental; Operational and technical; Human and organizational; Fault tree;
Bayesian network model.]


Fig. 2. Conceptual framework of risk-informed decision-making process.

In the following sub-sections, the steps of the risk management model
are presented, and these steps are applied to the dynamic positioning
system as a case study. A dynamic positioning (DP) system is a
computer-controlled system that can automatically maintain a vessel's
position and heading or a predefined track by controlling its own
propellers and thrusters. Fig. 4 shows the basic components of a DP system.

It can be observed that the power generation system, switchboard,
and thrusters are connected to the DP control system, which gathers
data from multiple sensors, including the reference system for position
(PRS), gyro compass for heading, motion reference units (MRU), and
environmental (wind) sensors. The dynamic positioning operator (DPO)
obtains information from the DP control system using the DP console,
which is a human-machine interface (HMI) that provides useful
information pertaining to the status of components. The DPO could also
access other information by means of communication systems, system
alarms, and alarm system traffic light.


[Figure 3: Boundary of study -> End state -> Initial events -> Event sequence diagram -> Event tree ->
Fault tree -> Bayesian network -> Decision support model.]


Fig. 3. Sub-models of risk management model.


[Figure 4: Labels: Wind sensor; PRS; Gyro compass; Communication; Warning; Alarm; Interface; DP control;
UPS; Switchboard; Thrusters; Power generation system.]


Fig. 4. Dynamic positioning system [20].


[Figure 5: Labels: Environment; Operators; DP system components.]


Fig. 5. System boundaries.


3.1. Boundary of the study


In the first step, the boundaries of the analysis should be defined,
and all components and influencing factors of the system are
determined. Based on the desired output, some components/factors are
ignored. In the analyzed DP system, the components considered are the
propulsion (thruster) system, power system, computer system, and
reference system. Moreover, the factors examined include the
environmental and operational conditions (Fig. 5). Human and
organizational elements are regarded as influencing factors of the system.


3.2. End states 


The end state is a specific situation to be investigated in the final
phase of system operation. In this study, the main end state is the loss of
position. The end state determines the initial events and critical
functions that should be included.


3.3. Initial events 


In the third step, the initial events (IEs), which can be defined based
on different methods (e.g., hazard and operability analysis (HAZOP) or
master logic diagram (MLD)), should be identified. Thereafter, the IEs
should be quantified. There are three types of IEs from the
quantification perspective. Some IEs are singular events, and their frequency can
be calculated based on historical data (e.g., tropical storm in a
particular geographical area). Some IEs are more complex and require a
fault tree to estimate their frequencies (e.g., inadvertent disconnection
of the lower marine riser package). Occasionally, some IEs are
conditional (e.g., drift-off because of bad weather). In these cases, the
dependencies of IEs should be considered in the frequency quantification.
The Bayesian network approach is typically employed to determine the
frequency of IEs and data uncertainties. The initial events of the studied
DP system are categorized based on the components where events
emanate, as summarized in Tables 1-4.
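
For a conditional IE such as drift-off because of bad weather, the dependency can be illustrated
with the law of total probability. The sketch below is a minimal example under stated assumptions:
the function name, the three weather states, and all numerical values are hypothetical and are not
taken from the IMCA data.

```python
def conditional_ie_frequency(condition_probs, ie_rate_given_condition):
    """Total probability: sum over conditions of P(condition) * rate(IE | condition)."""
    if abs(sum(condition_probs.values()) - 1.0) > 1e-9:
        raise ValueError("condition probabilities must sum to 1")
    return sum(p * ie_rate_given_condition[c] for c, p in condition_probs.items())


# Hypothetical share of operating time spent in each weather state.
weather = {"calm": 0.70, "moderate": 0.25, "severe": 0.05}
# Hypothetical drift-off frequency (per operation) given each weather state.
drift_off_rate = {"calm": 0.001, "moderate": 0.01, "severe": 0.2}

freq = conditional_ie_frequency(weather, drift_off_rate)
print(f"unconditional drift-off frequency = {freq:.4f}")
```

In this example the rare severe state contributes most of the drift-off frequency, which is the
kind of dependency that a frequency based on pooled historical counts alone would hide.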


3.4. Event sequence diagram 


The event sequence diagram (ESD) shows the interactions of events
arranged in a time sequence leading to different end states. This
diagram can aid in developing a system of event trees [22]. In this study,
the IMCA incident reports from 2004 to 2015 are considered as the
most probable ESDs in the DP system [23]. An example of an IMCA
incident report is shown in Fig. 6.


3.5. Event tree 


An event tree is an inductive analytical diagram in which an event is 
analyzed using Boolean logic to examine the chronological series of 
subsequent events or consequences. The event tree of the system should 
be developed based on the ESDs. In this research, the event tree is 
developed by analyzing all IMCA incident reports from 2004 to 2015. 
The general event tree for DPs is illustrated in Fig. 7. The first four 
events present a system loss of position (LOP), and the last event 
maintains the position status, indicating that the system operates 
properly (OK). 
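
The end-state probabilities of a sequential event tree of this form can be sketched as follows.
The code assumes that each branch probability is conditional on all earlier systems not having
failed; the function name and the numerical values are hypothetical.

```python
def event_tree_end_states(p_fail):
    """Map ordered branch failure probabilities to end-state probabilities.

    p_fail: mapping of system name -> failure probability on the 'No' path,
            in the order the systems are questioned in the event tree.
    """
    surviving = 1.0  # probability that no earlier branch has failed
    result = {}
    for i, (system, p) in enumerate(p_fail.items(), start=1):
        if not 0.0 <= p <= 1.0:
            raise ValueError(f"invalid probability for {system}: {p}")
        result[f"LOP{i}"] = surviving * p
        surviving *= 1.0 - p
    result["OK"] = surviving
    return result


# Hypothetical branch probabilities in the order propulsion, reference, power, control.
states = event_tree_end_states(
    {"propulsion": 0.02, "reference": 0.01, "power": 0.03, "control": 0.005}
)
for name, prob in states.items():
    print(f"{name}: {prob:.5f}")
```

The end states are mutually exclusive, so their probabilities sum to one, and the total
loss-of-position probability is one minus the probability of the OK state.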


Table 1
Initial events in propulsion system [21].

Component | Initial events
Thruster unit and drive | Drive short circuits; Main coupling error; DC motor field problem; Overheating; Loose wires
Control system | Pitch or RPM anomalies; Control unit PLC error; Network, communication error; Loose wires/incorrect wiring; Fuse, relay, PCB, and signal amplifier errors; Outstation internal power distribution error; Faulty emergency stop button
Feedback signal | Loose wires/connector; Loose or broken linkages; Faulty potentiometers; Incorrect feedback signal; Feedback failure; Speed sensor failure
Hydraulics (CPP, clutch) | Control valves, proportional, solenoid, and limit switch errors; Low hydraulic oil pressure/pitch pump failure; Hydraulic oil leaks; Gearbox and clutch failure
Propulsion auxiliary systems | Lubrication; cooling pump failure
Cooling, lubrication, air, and ventilation | Thruster brake failure; Cooling, water leak, and thermostat failures; Oil leaks
AC converter | AC converter, Rectifier, Inverter, and DC link failures
Main engine | Scavenge air fan; high exhaust temperature
Human error | Slip; Lapse; Rule-based mistake; Knowledge-based mistake; Routine violation; Optimization violation; Necessary violation


Table 2
Initial events in reference system [21].

Sub-component | Initial event
Interference | High sun activity; Interference from other telecommunication systems; Physical obstruction; Near operations (close proximity to other vessels); Water aeration; False target; Atmospheric interference
Software | Software error, "OK" after reboot; Software "freeze"; Software "bug"; Wrong settings and IP address; Calibration-insufficient T/C/QA; Update required
Mechanical | Damage caused by corrosion and wear; Taut wire fault; Antenna errors; Damaged/faulty sensor unit or deployment equipment; Heat and fumes from funnel; Position
Communication | Poor differential correction signals from ground station; Ground station computer failure; Signal error caused by satellite maintenance/satellite fault; Loss of satellite feed
Electrical, hardware | Defective card; Loose Connector; Low feed voltage
Service and maintenance |
Human error | Slip; Lapse; Rule-based mistake; Knowledge-based mistake; Routine violation; Optimization violation; Necessary violation


Table 3
Initial events in control system [21].

Sub-component | Initial event
Software | Software upgrade and tuning failures; Software modeling problem; Software "bug" (error, flaw, failure, or fault); Computer "freeze"; Anomaly (controller/operator station problems); Virus
Hardware | Motherboard failure; Hard drive and circuit board failures; Card failure; Insufficient cooling; Power supply, transmission, and distribution failures; Loose wire; Hardware component failure
UPS | Cooling fan failure; Burnt Card; Voltage control failure; Loose connection; Faulty switch; Charger failure
Human error | Slip; Lapse; Rule-based mistake; knowledge-based mistake; Routine violation; Optimization violation; Necessary violation


Table 4
Initial events in power system [21].

Sub-component | Initial event
PMS | Automatic disconnection of breakers; Faulty controller; Incorrect setup
Control equipment | Governor actuator failure; overspeed; incorrect settings; Incorrect setting of generator actuator; Sensor (e.g., speed) failure; Automatic fuel filter failure; Protection equipment failure (harmonic filter); Loose wire; Air supply failure
Excitation Systems | Voltage and frequency fluctuations; AVR errors; Faulty Diode plate or exciter; Exciter anomaly
Fuel system | Fuel pump failure; Compressed air failure; Blocked filters; Heavy fuel oil separator failure; Water contamination in fuel tanks; Fuel pipe leak; Fuel oil valve failure
Oil system | Pressure drop of isolation valve; Oil pressure sensor failure; Oil leak causing low pressure
Cooling system | Blocked SW strainers; Valve failure; Thermostat failure; Low cooling water pressure; SW pump failure; Leaks
Electrical | Internal short circuit of generator; Fuel supply relay; Earth fault short circuit; Distribution; Cabling; Blown fuses; Transformer, invertor failure
Engine component | Bearing failure; Rocker gear failure; Mechanical failure; Engine overload; Fuel injector failure
Human error | Slip; Lapse; Rule-based mistake; knowledge-based mistake; Routine violation; Optimization violation; Necessary violation


3.6. Fault tree


In this step, the possible failure modes are identified [24]. The list
should be extensive and include all possible failure modes. Those
perceived as not probable to occur or with negligible consequence can be
eliminated from further consideration at this stage [6]. A fault tree
should be developed for the remaining failure modes. In this study, the
main failure modes are identified according to the ESDs from IMCA
reports. The fault trees of the propulsion system, reference system, power
system, automatic control, and manual control of a DP system are presented
in Figs. 8-12, respectively. Depending on the specific dynamic positioning
operational requirements, the systems are assigned to one of four DP
categories (DP classes 0-3). In this study, DP class 2 is explored. DP class
2 vessels have redundancy so that no single fault in an active system will
cause the system to fail.

The system can be automatically or manually controlled; their fault
trees are presented in Figs. 11 and 12, respectively.
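
The effect of redundancy on a fault tree of the kind described in Section 3.6 can be sketched with
elementary AND and OR gates. This is a simplified illustration that assumes statistically
independent basic events and ignores common-cause failures; the gate structure shown and all
probabilities are hypothetical and do not reproduce Figs. 8-12.

```python
from math import prod


def or_gate(probs):
    """Output fails if at least one independent input fails."""
    return 1.0 - prod(1.0 - p for p in probs)


def and_gate(probs):
    """Output fails only if all independent inputs fail."""
    return prod(probs)


# Hypothetical basic-event probabilities for a single thruster unit.
basic_events = {"unit and drive": 0.01, "control system": 0.02, "feedback signal": 0.005}

# A single unit fails if any of its basic events occurs.
p_unit = or_gate(basic_events.values())
# A redundant pair (e.g., units R1 and R2) fails only if both units fail.
p_pair = and_gate([p_unit, p_unit])

print(f"single unit: {p_unit:.5f}, redundant pair: {p_pair:.6f}")
```

Under the independence assumption, the redundant pair is far less likely to fail than a single
unit, which reflects the class 2 principle that no single fault should cause loss of the function.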


3.7. Bayesian network 


Human errors can directly and indirectly lead to loss of position.
Reason [25] has defined a human error taxonomy based on the three
levels of human performance [26]: skill-based, rule-based, and
knowledge-based actions. Skill-based actions routinely and automatically
occur. Rule-based actions are responses partially controlled and
partially performed automatically that are focused on problems.
Knowledge-based actions are controlled and applied to novel problems.
Human error is defined as the failure of planned actions to realize a goal
(without the interference of certain unanticipated events) [27]. These
types of errors can be categorized as slips, lapses, and rule- and
knowledge-based mistakes. Another type of error involves the violations and
deviations from safe operating procedures. These safety violations are not
malevolent in intent and may be classified as routine, optimizing, and
necessary [25].

Human errors can result in the failure of a system function and
should be considered in the online risk model [28]. In this study, human
reliability analysis (HRA) is employed to estimate the human error
probability (HEP) for human failure events (HFEs) in fault trees. In HRA
methods, the human error probability is the conditional probability of a
failure event given the performance context. The context is analyzed
with the standardized plant analysis risk-human reliability analysis
(SPAR-H), which is a method that can quantify human risks in a system
[7] and assess the probability of human errors for a known context. The
analyst determines the underlying context by selecting a set of
performance-shaping factors (PSFs), which are discretized into levels or
states.
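
The SPAR-H quantification step can be sketched as follows. The sketch follows the general SPAR-H
rule as commonly documented (a nominal HEP of 0.01 for diagnosis and 0.001 for action tasks,
multiplied by the PSF multipliers, with an adjustment factor when three or more PSFs are negative);
the function name and the multiplier values in the example are illustrative assumptions and are not
taken from the case study.

```python
from math import prod

# Nominal HEPs per task type, as commonly cited for SPAR-H.
NOMINAL_HEP = {"diagnosis": 1e-2, "action": 1e-3}


def spar_h_hep(task_type, psf_multipliers):
    """Return the HEP for one task given a mapping of PSF name -> multiplier."""
    nhep = NOMINAL_HEP[task_type]
    composite = prod(psf_multipliers.values())
    negative = sum(1 for m in psf_multipliers.values() if m > 1.0)
    if negative >= 3:
        # Adjustment factor keeps the HEP below 1 when several PSFs are negative.
        return nhep * composite / (nhep * (composite - 1.0) + 1.0)
    return min(1.0, nhep * composite)


# Illustrative contexts (multipliers are assumptions for demonstration).
mild = spar_h_hep("action", {"stress": 2.0, "complexity": 2.0})
harsh = spar_h_hep("action", {"available time": 10.0, "stress": 2.0, "complexity": 5.0})
print(f"mild context HEP = {mild:.4f}, harsh context HEP = {harsh:.4f}")
```

PSFs at their nominal level carry a multiplier of 1 and therefore leave the nominal HEP unchanged.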

It is not always possible, however, to gather perfect information on
the PSF level, and in some cases, the PSFs are not directly measurable or
observable [2]. No theory related to the direct causal relationship
between various PSFs and human error types exists, indicating that it is
not clear to which degree, for example, stress affects lapses. All PSFs are
therefore linked to all human error types. The relationship between
human error types and affected systems is obtained from the data found
in the IMCA reports.


[Figure 6: incident sequence boxes: Vessel on DP in deepwater; Drilling operations; 2x DGPS & 2x acoustics
online; Wind squall from astern; Increase in thrust & power; 2 diesel generators trip; DP reduces thruster
pitch; Position lost (520m); Emergency disconnect.]


Fig. 6. Sample incident report from IMCA annual DP report [23].


[Figure 7: Initiating event -> Propulsion system failure (Yes: LOP1) -> Reference system failure (Yes: LOP2)
-> Power system failure (Yes: LOP3) -> Control system failure (Yes: LOP4; No: OK).]


Fig. 7. Event tree of DP systems.

In order to reduce the subjectivity associated with the estimated PSF
and the HEP from the analyst, prior probabilities in the Bayesian
network (BN) are determined, and the network is updated using available
observations. Bayesian networks afford several advantages that can
enhance the PRA method in human reliability assessment. The Bayesian
network of a DP system for the HRA is formulated, as shown in Fig. 13.
It should be noted that the first layer (PSFs) is connected to all human
error types; for simplicity, however, this is not included in the figure.

These Bayesian networks calculate the human error probabilities in
the reference, control, thruster, and power system. Apart from the
foregoing, human errors can also occur when the system operates in the
manual mode, as presented in Fig. 12. The failure probability of the
manual control system can be calculated based on the Bayesian network
model in Fig. 14.
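
The idea of setting priors and updating them with observations can be illustrated on a two-node
fragment of such a network, with one binary PSF node and one human error node. The fragment, the
function names, and all probabilities below are hypothetical; the networks in Figs. 13 and 14
contain more nodes and states.

```python
def marginal_error_prob(p_psf_degraded, p_err_given):
    """P(error) obtained by summing over the two PSF states."""
    return (
        p_psf_degraded * p_err_given["degraded"]
        + (1.0 - p_psf_degraded) * p_err_given["nominal"]
    )


def posterior_psf_degraded(p_psf_degraded, p_err_given):
    """P(PSF degraded | error observed) from Bayes' rule."""
    joint = p_psf_degraded * p_err_given["degraded"]
    return joint / marginal_error_prob(p_psf_degraded, p_err_given)


# Hypothetical prior belief that the PSF (e.g., stress) is in a degraded state.
prior_degraded = 0.2
# Hypothetical conditional probabilities of a lapse for each PSF state.
p_lapse_given = {"degraded": 0.05, "nominal": 0.005}

p_lapse = marginal_error_prob(prior_degraded, p_lapse_given)
post_degraded = posterior_psf_degraded(prior_degraded, p_lapse_given)
print(f"P(lapse) = {p_lapse:.4f}, P(degraded | lapse) = {post_degraded:.3f}")
```

Observing an error raises the belief that the PSF is degraded, and this updated belief would then
feed the next evaluation of the error probability.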


3.8. Decision support tool 


In the decision support model, alternative decisions should be
defined. The risk management model is thereafter employed to assess the
risk associated with each alternative decision. The decision with the
lowest risk should typically be the preferred decision alternative;
however, cost and availability are also important factors that can
influence the decision.
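
A minimal sketch of such a ranking is given below. It is a simplification made for illustration:
unavailable alternatives are filtered out, the remaining ones are ordered by failure probability,
and cost is used only to break ties. The alternative names, probabilities, and costs are
hypothetical, and the paper does not prescribe this particular rule.

```python
from typing import NamedTuple


class Alternative(NamedTuple):
    name: str
    failure_probability: float  # output of the risk management model
    cost: float                 # relative cost, arbitrary units
    available: bool = True


def rank_alternatives(alternatives):
    """Order available alternatives by failure probability, then by cost."""
    feasible = [a for a in alternatives if a.available]
    return sorted(feasible, key=lambda a: (a.failure_probability, a.cost))


# Hypothetical decision alternatives for a degraded DP operation.
options = [
    Alternative("stay, no extra measures", 0.020, 0.0),
    Alternative("stay, reduced operation", 0.008, 5.0),
    Alternative("move to safe zone", 0.002, 50.0),
    Alternative("move with tug assistance", 0.001, 80.0, available=False),
]

ranked = rank_alternatives(options)
for alt in ranked:
    print(f"{alt.name}: P(fail) = {alt.failure_probability}, cost = {alt.cost}")
```

A practical tool would likely weigh risk against cost rather than treat cost as a tie-breaker
only; the ordering key is the single place where such a trade-off would be introduced.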

Most DP drilling operations on the Norwegian Continental Shelf
(NCS) follow a set of well specific operating guidelines (WSOG) that
includes procedures/guidelines for operations. The WSOG specifies the limits for
different DP parameters that indicate the state of the DP system and
operation. Typically, there are four states: green, white, yellow, and
red. In the green state, all DP operation parameters are within
acceptable limits, and the operation can proceed normally. When the DP
operation is in a white state, some of the redundancies are lost, or
weather conditions have worsened. This state is also called the advisory
state because of the necessity for calling an advisory meeting to discuss
how operations should proceed. In the yellow state, even more
redundancies are lost, and/or weather conditions have further worsened;
the vessel remains capable of maintaining position, but the probability
is low. In the red state, the mobile offshore drilling unit (MODU) is no
longer capable of maintaining position, the operation should be
aborted, and under most circumstances, the unit should be moved to
the safe zone at the earliest possible time.

[Figure 8: Labels: Propulsion fault; Thruster 1 fault; Thruster 2 fault (Thruster 2 R1 fault, Thruster 2 R2
fault); Thruster 3 fault; Thruster 4 fault; Unit and drive fault; Control sys fault; Feedback signal error;
Hydraulics fault; AUX sys fault; Cooling, lubrication, air ventilation fault; AC converter; Main engine.]


Fig. 8. Propulsion system fault tree of a DP class 2 system.


[Figure 9: Labels: Reference fault; MRUs fault; GPSs fault (GPS R1 fault, GPS R2 fault, GPS R3 fault);
Heading sensing gyroscope fault; Wind speed & direction sensors fault; Current sensor fault; Interference
fault; Software error; Mechanical; Faulty prob O&M; Communication; Electrical fault.]


Fig. 9. Reference system fault tree of a DP class 2 system.


[Figure 10: Labels: Power sys fault; Power generation fault (Power gen R1 fault, Power gen R2 fault, Power
gen R3 fault, Power gen R4 fault); Bus coupler breaker; Power management fault (PMS R1 fault, PMS R2 fault);
Fuel sys fault; Oil sys fault; Cooling sys fault; Engine components fault; Control equip fault; Excitation
sys fault; Electrical fault.]


Fig. 10. Power system fault tree of a DP class 2 system.


The WSOG is extremely prescriptive and stipulates the actions
required in case the DP system loses its redundancies, or weather
conditions are unfavorable. This is especially true for the green state (all
systems are operational, and all conditions are within operational
limits) and red state (redundancy within the DP system is compromised
to the extent that the vessel encounters difficulty in or is not capable of
maintaining position). During the advisory and yellow states, the time
for evaluation remains available because the capability of maintaining
position is not directly threatened. Situational factors, however, should
be evaluated to determine the risks associated with various decision
alternatives, i.e., whether the MODU stays or moves to the safe zone. It
is certain that additional decisions have to be made. If the MODU stays,
are additional risk reduction measures necessary? If it is moved to the
safe zone, what is the best way to do so? The decision support tool
presented in this paper aims to provide input in making these decisions.
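
The distinction drawn above between prescribed states (green, red) and states that leave room for
evaluation (white/advisory, yellow) can be summarized in a short sketch. The enumeration, function
name, and returned strings are hypothetical shorthand for the text and are not part of any WSOG
document.

```python
from enum import Enum


class WsogState(Enum):
    GREEN = "green"
    WHITE = "white"    # also called the advisory state
    YELLOW = "yellow"
    RED = "red"


def handling(state):
    """Return a coarse description of how each WSOG state is handled."""
    if state is WsogState.GREEN:
        return "proceed normally"
    if state is WsogState.RED:
        return "abort operation and move to the safe zone"
    # White and yellow: time for evaluation remains, so the risk of each
    # decision alternative can be assessed before acting.
    return "evaluate decision alternatives with the risk model"


for s in WsogState:
    print(f"{s.value}: {handling(s)}")
```

Only the white and yellow branches call for a comparison of alternatives, which is where the
output of a decision support tool would be used.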


4. Risk management model applied to a MODU drift-off incident at 
Skarv field 


An incident investigation report of a MODU is used to study the 
applicability of the developed framework. Mobile offshore drilling units 
are employed in the exploratory offshore drilling of new oil and gas 
wells. They rest on columns and pontoons and can be moored with 
anchors. For deep water drilling operations, however, these units (such 
as the studied MODU in the presented incident investigation report) 
rely on the DP system to maintain position. The DP system of MODUs 
follows the same rules as other types of DP systems, and their risk levels 
could be calculated using the method presented in Section 3. 

In the following sections, the incident is first described according to 




[Figure: fault tree diagram. Recoverable node labels: Control sys fault; Power fault; Computer fault; UPS R2 fault; Faulty switch; Card burnt.]


Fig. 11. Automatic control system fault tree of a DP class 2 system. 








[Figure: fault tree diagram. Recoverable node labels: Control sys fault; Controlling fault; Power generation fault; UPS R1 fault; UPS R2 fault; Hardware R2 fault.]


Fig. 12. Manual control system fault tree of a DP class 2 system. 














[Figure: Bayesian network diagram. Recoverable node labels: Manual control error; Work processes; Fitness for duty; Available time; Procedures; Mistake rule-based; Mistake knowledge-based; Routine violation; Optimizing violation; Control system; Thruster system; Reference system.]





Fig. 13. BN for HRA of DP class 2 system (Section 3.6). 




[Figure: Bayesian network diagram. Recoverable node labels: Fitness for duty; Available time; Procedures; Work processes; Mistake rule-based; Mistake knowledge-based; Routine violation; Optimizing violation; Manual control system.]


Fig. 14. BN for HRA of DP class 2 manual control system. 


the investigation report, and the different scenarios generated to study 
the effectiveness of the proposed model are elaborated. In Section 4.2, a 
data flow diagram of the proposed risk management model for the case 
study is presented. This diagram indicates the required input to the 
model introduced in Section 4.3. Finally, the results of different sce- 
narios are discussed and compared in Section 4.4. 


4.1. Description of incidents and scenarios 


On the Deepsea Stavanger platform, in April 2018 at 12:38, the MODU was
observed to have lost position during a riser workover and float tree
installation on well A-03 [29]. A detailed course of events is
presented in Fig. 15. During this incident, the MODU exceeded the
limits for the advisory, yellow, and red states. The WSOG for the
Deepsea Stavanger specifies that in the red state, the offshore
installation manager (OIM) has the discretion to decide whether the
situation threatens the equipment and what actions to take. In this
case, the OIM observed that the speed at which the MODU was moving off
position was decreasing and accordingly expected that the vessel would
immediately regain position. The OIM therefore decided not to disconnect
and move the MODU to the safe zone; instead, the automatic disconnect
was inhibited while waiting for the MODU to regain its position with
the DP system. The vessel regained position after a maximum
excursion of 12 m. The ESD of the incident is presented in Fig. 15.

In this study, the event is analyzed using the SPAR-H method. Apart 
from the scenario described in the incident report (Scenario 1), an al- 
ternative scenario is also analyzed (Scenario 2). In this alternative 
scenario, it is assumed that the OIM is not on the bridge and that the 
dynamic positioning operator (DPO) is responsible for the decision- 
making and handling of the loss of position. This scenario is included to 
present a realistic alternative decision-making situation and how this 
affects the risk status. 

In Scenario 1 (Fig. 15), both the OIM and DPO are on the bridge 
when several high waves push the MODU off position. Moreover, the 
thrust to keep the MODU in position is insufficient because the ongoing 
drilling operation is consuming power; consequently, drift-off occurs. 


[Figure: event sequence in five stages.]

Morning of the incident: The system gives several "demand reduced" alarms. Demand was marginally above the limit and position keeping capabilities were not affected. Operators noticed the alarm, but did not get the anticipated consequence analysis alarm; instead they got a network overflow alarm that took their attention away from the power alarm.

Changes in power configuration: Due to the drilling operation, the driller changed the power configuration from a 2-split to a 4-split mode with 1 diesel generator supporting each of the 4 switchboards and all 8 thrusters running.

Increased environmental forces: At 12:38 one or several waves forced the MODU off position in the direction of the environmental forces. The DPO and OIM expected the MODU to be able to regain position, but there was insufficient thrust available. Seconds later, insufficient thrust and demand reduced alarms went off on all SWBDs.

Continued loss of position: The MODU continued to lose position and exceeded the advisory, yellow (at 7 m), and red (at 11 m) watch. Before reaching the red limit, the OIM instructed the DPO to inhibit the Auto Disconnect. The decision was based on the reduction in speed and the expectancy that there was sufficient thrust to regain position to avoid an emergency disconnect with associated consequences. The decision was in line with internal procedures.

Normalization and investigation: The vessel regained position after having a max. excursion of 12 meters. The investigation found out that the power configuration left the DP system with less residual power than assumed.

Fig. 15. Event sequence of Deepsea Skarv incident based on investigation report [29].










Between the first wave and the 12-m excursion, the maximum response 
time is 1 min. Scenario 2 is similar to Scenario 1. The MODU is pushed 
off position by several high waves. Further, the thrust necessary for the 
MODU to maintain position is insufficient because the ongoing drilling 
operation is consuming power; as a result, drift-off occurs. The differ- 
ence between Scenario 1 and 2 is the manning on the bridge. In Sce- 
nario 1, similar to the incident on the Deepsea Stavanger, the OIM and 
DPO are both on the bridge. In Scenario 2, however, it is assumed that 
the OIM is unable to reach the bridge on time, leaving the DPO to 
handle the drift-off alone. 

The loss of position probabilities under automatic and manual 
conditions are calculated for each scenario. The control systems in the 
ESD for manual and automatic modes differ. In the automatic mode, the 
DP model controls the system, and the failure probability is calculated 
using the presented fault tree in Fig. 11. In the manual mode, however, 
with a human operator involved, the failure probability is calculated 
using the fault tree in Fig. 12 and the Bayesian network in Fig. 14. 


4.2. Data flow diagram of case study 


In this section, the data flow of the proposed model and the deri- 
vation of results from input parameters are presented. The data flow for 
the reference system is shown in Fig. 16. The data flows for the thruster, 
control, and power system are the same. The first layer involves BNs. In 
this level, human error probabilities are calculated based on PSFs and 







[Figure: data flow diagram. Recoverable labels: (2) IMCA reports; Components failure frequencies; LOP due to bad weather; Propulsion system failure; Reference system failure; Power system failure (Yes: LOP3); Control system failure (Yes: LOP4); MRUs fault; GPSs fault; Wind speed & direction sensors fault.]







IMCA incident reports. Its outputs are human errors related to power, 
propulsion, control, and reference systems. The values and failure fre- 
quencies of components obtained from IMCA reports update the basic 
event probabilities/frequencies in all fault trees. The figure shows the 
fault tree of the reference system as an example. The model considers 
all other fault trees, including those of power, propulsion, and control 
systems. The failure probabilities of these systems are then used as 
input by the ESD, and the probabilities are calculated according to the 
initial event. 


4.3. Frequency/probabilities of basic events in case study 


As presented in Fig. 16, the frequency/probability of basic events 
should be considered as model input. In this section, these values are 
evaluated based on available historical data. Specifically, the fre- 
quency/probability of basic events are derived using the IMCA annual 
incident data on DP drilling units from 2004 to 2015. The failure fre- 
quencies of components are presented in Section 4.3.1. In addition, the 
human error probabilities are quantified using BNs, as presented in 
Figs. 13 and 14. The quantification process is presented in 
Section 4.3.2. 


4.3.1. Component failure frequencies 
Operating hours 
The operational times of DP vessels from 2004 to 2010 are gathered 


[Figure, continued. Recoverable labels: Yes: LOP1; Yes: LOP2.]


Fig. 16. Data flow of case study. 




Table 5
Operational times of drilling DP units from 2004 to 2015 worldwide [30].

Year                     2004   2005   2006   2007   2008
Operating time (years)   49.4   49.7   50.3   50.3   72.1


Table 6
Failure frequencies of thruster sub-systems in DP drilling unit based on IMCA incident report.

Failure group                                Causes                                                     Frequency per hour¹
Thruster unit and drive                      Main coupling error                                        1.22E-07
                                             DC motor field problem                                     1.22E-07
                                             Unknown                                                    4.88E-07
Control system                               Control unit PLC error                                     2.44E-07
                                             Fuse, relay, PCB, and signal amplifier errors              2.44E-07
                                             Outstation internal power distribution errors              2.44E-07
                                             Unknown                                                    6.09E-07
Feedback signal                              Loose or broken linkages                                   1.22E-07
                                             Faulty potentiometers                                      1.22E-07
Hydraulics (CPP, clutch)                     Control valve, proportional, solenoid, and switch errors   1.22E-07
                                             Low hydraulic oil pressure/pitch pump failure              1.22E-07
Cooling, lubrication, air, and ventilation   Thruster brake failure                                     1.22E-07
                                             Oil leaks                                                  1.22E-07

¹ Numbers are rounded off for the purpose of reporting.


from drilling semi-submersibles and drill ships worldwide [30], and the 
operating hours for the other years are estimated based on the available 
data trend. The total operational time over a 12-year span is 937 years. 
Frequencies 
The failure frequency is calculated from Eq. (1), and the number of 
incidents is gathered from IMCA reports. The operating hours are 
summarized in Table 5. 


\[
f_r = \frac{\text{Number of incidents}}{\text{Hours of operation}} \tag{1}
\]


The failure frequencies of all initial events of components are 
summarized in Tables 6-9. 
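
As a minimal sketch of Eq. (1), the snippet below recomputes failure frequencies from the operating times listed in Table 5. The conversion factor of 8760 hours per year and the function name `failure_frequency` are assumptions introduced here for illustration; the paper does not state how years were converted to hours. Under this assumption, one to five incidents over the full period give 1.22E-07 to 6.09E-07 per hour, which appears consistent with the values reported in Tables 6-9.

```python
HOURS_PER_YEAR = 8760  # assumption: 365 days x 24 h; not stated in the paper

# Operating time per calendar year (in years of operation), as listed in Table 5.
OPERATING_YEARS = {
    2004: 49.4, 2005: 49.7, 2006: 50.3, 2007: 50.3, 2008: 72.1, 2009: 80.6,
    2010: 89.5, 2011: 87.5, 2012: 93.2, 2013: 98.9, 2014: 104.7, 2015: 110.4,
}


def failure_frequency(incidents: int, operating_years: float) -> float:
    """Eq. (1): number of incidents divided by hours of operation."""
    hours = operating_years * HOURS_PER_YEAR
    if hours <= 0:
        raise ValueError("operating time must be positive")
    return incidents / hours


total_years = sum(OPERATING_YEARS.values())  # roughly 937 years in total

# Hypothetical incident counts, chosen only to illustrate the calculation.
for n in (1, 2, 3, 4, 5):
    print(n, f"{failure_frequency(n, total_years):.2E}")
```

The incident counts in the loop are illustrative; the actual counts per cause come from the IMCA reports and are not listed in the text.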


4.3.2. Human and organizational error frequencies 

The human and organizational errors are quantified based on the 
BN model. The structure of the BN for the DP drilling unit is presented 
in Fig. 13. The structure is derived based on information gathered from 
IMCA reports. 

The human failure event for these two scenarios is defined as failure 
to handle a drift-off. The focus is on the diagnosis and decision-making 
in incident scenarios. The PSFs of the two scenarios are presented in 


Table 7
Failure frequency of reference sub-systems in DP drilling unit based on IMCA incident report.

Failure group    Causes                                                 Frequency per hour¹
Interference     Physical obstruction                                   1.22E-07
                 Atmospheric interference                               1.22E-07
Software         Wrong settings, IP address                             1.22E-07
                 Calibration-insufficient T/C/QA                        1.22E-07
                 Unknown failures of the software                       1.22E-07
Mechanical       Damaged/faulty sensor unit or deployment equipment     4.88E-07
Communication    Loss of satellite feed                                 3.66E-07
Unknown          Unknown                                                1.22E-07

¹ Numbers are rounded off for the purpose of reporting.




Table 5 (continued)

Year                     2009   2010   2011   2012   2013   2014    2015
Operating time (years)   80.6   89.5   87.5   93.2   98.9   104.7   110.4

Table 8
Failure frequency of control sub-systems in DP drilling unit based on IMCA incident report.

Failure group   Causes                                          Frequency per hour¹
Software        Software modeling issues                        3.66E-07
                Anomaly (controller/operator station problems)  4.88E-07
Hardware        Motherboard failure                             1.22E-07
                Hardware components failure                     1.22E-07
UPS             Unspecified errors                              2.44E-07
                Faulty switch                                   1.22E-07

¹ Numbers are rounded off for the purpose of reporting.


Table 9
Failure frequencies of power sub-systems in DP drilling unit based on IMCA incident reports.

Failure group       Causes                                                         Frequency per hour¹
Control equipment   Faulty controller                                              1.22E-07
                    Unknown                                                        1.22E-07
                    Governor actuator failure, overspeed, and incorrect settings   3.66E-07
                    Generator actuator and incorrect setting                       2.44E-07
                    Sensor (e.g., speed) failure                                   1.22E-07
                    Protection equipment failure (harmonic filter)                 2.44E-07
                    Unknown                                                        1.22E-07
Electrical          Generator internal short circuit                               2.44E-07
                    Fuel supply relay failure                                      1.22E-07
                    Short circuit and earth fault                                  3.66E-07
                    Distribution, cabling, grounding, and blown fuses              1.22E-07
                    Transformer and invertor failure                               1.22E-07
                    Bus bar failure                                                1.22E-07
Engine component    Engine overload                                                3.66E-07
Fuel system         Compressed air failure                                         2.44E-07
                    Fuel pipe leak                                                 1.22E-07
Cooling system      Blocked SW strainers                                           1.22E-07
                    Unspecified errors                                             1.22E-07

¹ Numbers are rounded off for the purpose of reporting.


Table 10. 
The parameters in the table are summarized as follows. 


(a) Based on the information available from the incident, sufficient
time is available to perform the action; hence, the available time is
considered nominal.

(b) Based on the available time and with the OIM on the bridge, it is
assumed that the DPO also has sufficient time to act during the incident.

(c) Based on the description of the incident, an automatic or manual
emergency disconnect would have been triggered had its functionality
not been inhibited. Without the OIM on the bridge, however, the DPO
alone makes the decision to inhibit the disconnect, which has potential
safety and financial consequences. The PSF stress is therefore
considered high.

(d) In the Skarv incident, the OIM had a good overview of the
situation and could control it. The PSF stress is therefore considered
nominal.

(e) In the incident description, it is not evident whether the
diagnosis is easy or difficult. Accordingly, complexity is evaluated as
having a nominal effect on HEP.




Table 10
PSF factors of Deepsea Skarv incident and alternative scenario without OIM on bridge. Letters refer to the evaluations (a) to (p) given in the text; "-" marks a level that is not assigned in that scenario.

PSFs               PSF levels                 Scenario 1 (incident with OIM on bridge)   Scenario 2 (incident without OIM on bridge)
Available time     Nominal                    a                                          b
Stress             High                       -                                          c
                   Nominal                    d                                          -
Complexity         Nominal                    e                                          f
Training           High                       g                                          -
                   Nominal                    -                                          h
Procedures         Available but poor         -                                          i
                   Nominal                    j                                          -
Ergonomics         Nominal                    k                                          l
Fitness for duty   Insufficient information   -                                          m
                   Nominal                    n                                          -
Work processes     Insufficient information   -                                          o
                   Nominal                    p                                          -


(f) In the incident description, it is not evident whether the diagnosis
is easy or difficult. Accordingly, complexity is evaluated as having a
nominal effect on HEP.

(g) The OIM is considered to have extensive experience with DP.
Training and experience are therefore evaluated as high.

(h) This scenario is hypothetical; hence, no assumptions are made
on the level of experience or training of the DPO.

(i) The limited description of the scenario is presumed to lead to a
difference in approach between the DPO and OIM. The procedure
indicates that handling such a situation depends on the OIM's
discretion. If the OIM is not available, however, it is assumed
that the DPO may decide to disconnect and move the
MODU to a safe zone to stabilize the power distribution.

(j) Procedures are available and are relatively simple and
straightforward; however, they are not particularly descriptive. The
effect of procedures is therefore considered to be nominal.

(k) No information is available on the state of the ergonomics/HMI.

(l) No information is available on the state of the ergonomics/HMI.

(m) This situation is hypothetical; hence, no assumptions are made
on the fitness for duty of the DPO.

(n) The OIM is capable of performing tasks, and no performance
degradation is described. The PSF is therefore considered nominal.

(o) In the incident description, there is insufficient information on
the effect of work processes on the performance of the DPO.

(p) In the incident description, decision-making seems to have been
performed unilaterally by the OIM. It is therefore considered that
work processes are not performance drivers.


Table 11 summarizes the PSFs for a DPO action in the DP drilling 
unit. 
The conditional probabilities are calculated by Eq. (2) [31]. 


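\[
\mathrm{HEP} = \frac{0.01 \times \prod \mathrm{PSFs}}{0.01 \times \left(\prod \mathrm{PSFs} - 1\right) + 1} + \frac{0.001 \times \prod \mathrm{PSFs}}{0.001 \times \left(\prod \mathrm{PSFs} - 1\right) + 1} \tag{2}
\]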


The first term represents the human error probability (HEP) of a
diagnostic task (e.g., decision-making), and the second term represents
the HEP of an action task (e.g., pushing a button). The conditional
probabilities of the slip, rule-based mistake, and knowledge-based
mistake nodes are calculated using Eq. (2). It should be noted that for
two PSF levels, namely inadequate available time and unfitness for
duty, the final conditional probability is set to 1, regardless of the
other PSF levels.
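
The calculation in Eq. (2) can be sketched as follows, using the multipliers of Table 11. This is an illustrative sketch rather than the authors' implementation: the dictionary keys, the function names, and the choice of which PSF levels feed a given BN node are assumptions made here, since the text does not list them node by node.

```python
from math import prod

NOMINAL_HEP_DIAGNOSIS = 0.01  # nominal HEP in the first term of Eq. (2)
NOMINAL_HEP_ACTION = 0.001    # nominal HEP in the second term of Eq. (2)

# Multipliers from Table 11. None marks the two levels for which the text
# assigns a failure probability of 1 regardless of the other PSFs.
PSF_MULTIPLIERS = {
    "available_time": {"inadequate": None, "nominal": 1},
    "stress": {"high": 2, "nominal": 1},
    "complexity": {"highly_complex": 5, "nominal": 1},
    "training": {"high": 0.5, "nominal": 1},
    "procedures": {"available_but_poor": 5, "nominal": 1},
    "ergonomics": {"poor": 10, "nominal": 1},
    "fitness_for_duty": {"unfit": None, "nominal": 1},
    "work_processes": {"poor": 2, "nominal": 1},
}


def adjusted_hep(nominal: float, composite: float) -> float:
    """One term of Eq. (2) for a given nominal HEP and PSF product."""
    return nominal * composite / (nominal * (composite - 1) + 1)


def hep(levels: dict) -> float:
    """Eq. (2). `levels` maps PSF name to level; omitted PSFs count as nominal."""
    multipliers = [PSF_MULTIPLIERS[psf][level] for psf, level in levels.items()]
    if any(m is None for m in multipliers):
        return 1.0  # inadequate time or unfit for duty
    composite = prod(multipliers)
    return (adjusted_hep(NOMINAL_HEP_DIAGNOSIS, composite)
            + adjusted_hep(NOMINAL_HEP_ACTION, composite))


# Hypothetical PSF combinations, for illustration only.
print(hep({}))                                                   # all nominal
print(hep({"stress": "high", "procedures": "available_but_poor"}))
print(hep({"fitness_for_duty": "unfit"}))
```

The level "Insufficient information" that appears in Table 10 has no multiplier in Table 11 and is therefore left out of this sketch.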

The conditional probabilities from human error to failures in the 
reference, control, propulsion, and power systems are calculated based 




Table 11
PSFs of case study.

PSFs               PSF levels           Multiplier
Available time     Inadequate time      P(failure) = 1
                   Nominal              1
Stress             High                 2
                   Nominal              1
Complexity         Highly complex       5
                   Nominal              1
Training           High                 0.5
                   Nominal              1
Procedures         Available but poor   5
                   Nominal              1
Ergonomics         Poor                 10
                   Nominal              1
Fitness for duty   Unfit                P(failure) = 1
                   Nominal              1
Work processes     Poor                 2
                   Nominal              1


on the information gathered from IMCA reports. 


4.4. Results 


In this section, the results of the application of risk management 
framework in the case study are presented. As mentioned earlier, two 
scenarios are compared. 


• Scenario 1: The offshore installation manager (OIM) is on the bridge
and has the discretion to decide whether the situation is a threat to
equipment and what actions to take.

• Scenario 2: The OIM is not on the bridge, and the DPO is solely
responsible for decision-making and handling of loss of position.


The first three rows of Table 12 present the three human error types
(slip, rule-based mistake, and knowledge-based mistake) considered in
this study. The slip and mistake probabilities increase significantly
from Scenario 1 to Scenario 2. The OIM is not on the bridge in the
second scenario; hence, the human error probability increases. The error
probability is largest in the rule-based mistake category because of the
significance of the training and experience PSF.

The remaining rows in Table 12 give the human error probabilities for
the different systems, namely the power, reference, thruster, and
control systems. Human error probabilities are lower in Scenario 1
because of the presence of the OIM.

As presented in the ESD (Fig. 7), five end states, including the LOPs 
caused by propulsion system failure, reference system failure, power 
system failure, control system failure, and keep safe position (“OK”), 


Table 12
Human error probabilities for different components in two scenarios.

                                    State        Scenario 1   Scenario 2
Slip                                Yes          0.0022       0.0222
                                    No           0.9978       0.9778
Rule-based mistake                  Yes          0.0110       0.1011
                                    No           0.9890       0.8989
Knowledge-based mistake             Yes          0.0110       0.0511
                                    No           0.9890       0.9489
Human error in power system         Healthy      0.9921       0.9432
                                    Faulty       0.0079       0.0568
Human error in reference system     Healthy      0.9953       0.9641
                                    Faulty       0.0047       0.0359
Human error in thruster system      Healthy      0.9981       0.9819
                                    Faulty       0.0019       0.0181
Manual control error                No error     0.9779       0.8358
                                    With error   0.0221       0.1642
Automatic control system            Healthy      0.9916       0.9430
                                    Faulty       0.0084       0.0570




Table 13
End state probabilities in manual and automatic modes for two scenarios (LOP: loss of position).

            Scenario 1               Scenario 2
            Automatic    Manual      Automatic    Manual
LOP 1       0.0020       0.0020      0.0181       0.0181
LOP 2       0.0042       0.0042      0.0315       0.0315
LOP 3       0.0056       0.0056      0.0391       0.0391
LOP 4       0.0040       0.0218      0.0271       0.1496
“OK”        0.9843       0.9664      0.8842       0.7616


are defined for the case study. Table 13 summarizes the probabilities of 
occurrence of end states in the ESD. In the list, two operation modes are 
considered for each scenario. 


• Automatic mode: the DP system automatically maintains the vessel
position using the control system and related actuators.

• Manual mode: operators use a joystick to control the position and
heading of the DP system.


It is observed that for the first three end states (LOP 1-3), there is no 
difference between the probability of failures under the manual and 
automatic operation modes. This lack of difference is attributed to the 
fact that the power, thruster, and reference systems are independent 
from the manner the systems are controlled. The fourth end state (LOP 
4), however, presents the control system failure, and the probability of 
this state is considerably affected by the control method (manual or 
automatic). 

The significance of the human factor is observed in LOP 4, where the
probability of failure increases considerably from the automatic to the
manual operation mode in both scenarios. This increase is more
pronounced under Scenario 2 because of lower manning: the human factor
is more consequential owing to the higher stress level and the lower
level of training and procedures. Comparison of the two scenarios
indicates the importance of the training factor, since the probability
of failure in the manual mode under Scenario 1 is even lower than the
probability of failure in the automatic mode under Scenario 2.

The details summarized in Table 13 could aid operators in making
better decisions. The “OK” probabilities of the automatic and manual
modes can be compared in the table, and the operator can select the
mode of operation based on these values. In both scenarios, the
probability of “OK” is higher in the automatic mode; it is therefore
more advantageous to attempt to use the automatic mode to maintain the
vessel position.
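
The comparison described above can be sketched in a few lines. The data are the end state probabilities of Table 13; the dictionary layout and the function name `preferred_mode` are hypothetical and serve only to illustrate how a decision support tool might rank the two modes by their “OK” probability.

```python
# End state probabilities from Table 13, keyed by (scenario, operation mode).
END_STATES = {
    ("Scenario 1", "automatic"): {"LOP 1": 0.0020, "LOP 2": 0.0042,
                                  "LOP 3": 0.0056, "LOP 4": 0.0040, "OK": 0.9843},
    ("Scenario 1", "manual"): {"LOP 1": 0.0020, "LOP 2": 0.0042,
                               "LOP 3": 0.0056, "LOP 4": 0.0218, "OK": 0.9664},
    ("Scenario 2", "automatic"): {"LOP 1": 0.0181, "LOP 2": 0.0315,
                                  "LOP 3": 0.0391, "LOP 4": 0.0271, "OK": 0.8842},
    ("Scenario 2", "manual"): {"LOP 1": 0.0181, "LOP 2": 0.0315,
                               "LOP 3": 0.0391, "LOP 4": 0.1496, "OK": 0.7616},
}


def preferred_mode(scenario: str) -> str:
    """Return the operation mode with the highest 'OK' probability."""
    ok_by_mode = {
        mode: states["OK"]
        for (name, mode), states in END_STATES.items()
        if name == scenario
    }
    return max(ok_by_mode, key=ok_by_mode.get)


for scenario in ("Scenario 1", "Scenario 2"):
    print(scenario, "->", preferred_mode(scenario))
```

This ranking uses the “OK” probability alone; other decision criteria such as cost and time are outside the scope of the sketch.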

The unavailability of various systems is listed in Table 14. The va- 
lues indicate the proportion of time that DP components are not in a 
functioning condition, as calculated by Eq. (3): 


\[
\text{Unavailability} = \frac{\text{Down time}}{\text{Total time}} = \frac{\mathrm{MTTR}}{\mathrm{MTTR} + \mathrm{MTTF}} \tag{3}
\]


where MTTR is the mean time to repair and MTTF is the mean time to
failure. The unavailability values for the manual and automatic modes
are the same for the thruster, reference, and power systems, but differ
for the control system. In Scenario 2, all values are significantly


Table 14
Unavailability of DP components in two scenarios.

                            Scenario 1   Scenario 2
Thruster system             0.0020       0.0181
Reference system            0.0048       0.0359
Power system                0.0079       0.0568
Automatic control system    0.0084       0.0570
Manual control system       0.0221       0.1642




higher because of the human error factor and the effect of training and 
procedure in Scenario 1. The largest increase in unavailability between 
Scenario 1 and Scenario 2, however, occurs in the manual control 
system where the human factor is more critical. 
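
As a small worked example, the sketch below evaluates Eq. (3) and then compares the Table 14 values between the two scenarios. The MTTR and MTTF figures passed to `unavailability` are hypothetical, since the paper does not report them; the Table 14 values are copied from the table. The comparison uses the absolute difference between scenarios, which is an assumption about what "largest increase" means here.

```python
def unavailability(mttr: float, mttf: float) -> float:
    """Eq. (3): share of total time during which a component is down."""
    if mttr < 0 or mttf <= 0:
        raise ValueError("MTTR must be non-negative and MTTF positive")
    return mttr / (mttr + mttf)


# Table 14 values as (Scenario 1, Scenario 2).
TABLE_14 = {
    "Thruster system": (0.0020, 0.0181),
    "Reference system": (0.0048, 0.0359),
    "Power system": (0.0079, 0.0568),
    "Automatic control system": (0.0084, 0.0570),
    "Manual control system": (0.0221, 0.1642),
}

# Absolute increase in unavailability from Scenario 1 to Scenario 2.
increase = {name: s2 - s1 for name, (s1, s2) in TABLE_14.items()}
largest = max(increase, key=increase.get)

# Hypothetical example: 2 h mean repair time, 98 h mean time to failure.
print(unavailability(mttr=2.0, mttf=98.0))
print(largest, round(increase[largest], 4))
```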

The results summarized in Tables 13 and 14 could serve as input to 
the decision support tool. The unavailability of different DP compo- 
nents is part of the risk levels associated with remaining in location. The 
results also indicate means of leaving the location (if necessary) by 
using the unavailability values of the manual and automatic control 
systems as input. 


5. Discussion 
5.1. Applicability of proposed risk management model 


In order to evaluate the effectiveness of the model, a case study with 
two operating scenarios is employed. The comparison between the re- 
sults derived from the two scenarios indicates the sensitivity of the 
model to the PSFs of an operator. According to the results, when the
stress level is high and the procedures are inadequate, the system
failure probability is high, as expected. This demonstrates the
usefulness of the model. Moreover, the two operational modes, manual
and automatic, are compared in each scenario. These comparisons
illustrate how the model responds to changes in system settings. For
example, when the operating mode is shifted from automatic to manual,
the human error level is higher. This is mainly because human
interaction is greater in the manual mode, which leads to a higher level
of human error, as the results clearly indicate.


5.2. Implications of presented results 


According to the analysis data from IMCA reports, the fault trees of 
the DP system are highly dependent on vessel type and operational 
mode. 

The presented model is applicable to a general type of DP class 2
drilling vessel (Section 3.6). Because the component redundancy options
for DP class 3 are far wider and multiple alternative combinations can
be designed for DP class 3, we focused on DP class 2 for greater
generality. In addition, a DP class 3 analysis would yield a higher
reliability level; however, the relative risk level between the
scenarios would remain the same. The presented results are therefore
applicable to DP class 3 as well, in relative terms.

In practice, the initial event frequencies and the conditional
probabilities of human errors should be continuously updated according
to the incident data of the vessel that uses the model. As presented in
Fig. 16, the input parameters of the model are updated at each step.
Each reported failure updates either the human error probability or the
failure frequency, depending on the initial failure event. Recorded
failures resulting from human error update the conditional
probabilities of human error in step 1 of the diagram because this step
gives the system failure probabilities caused by human errors.
Failures resulting from technical problems update the failure
frequencies in step 2 of the diagram because this step gives the
failure rates of components.
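
One simple way to realize such an update for a component failure frequency is to pool new incidents and new operating hours with the historical totals and re-evaluate Eq. (1). The paper does not specify the update rule, so the pooled recount below, the class name `FrequencyEstimate`, and the 8760 hours per year conversion are all assumptions made for illustration.

```python
from dataclasses import dataclass

HOURS_PER_YEAR = 8760  # assumption: 365 days x 24 h


@dataclass
class FrequencyEstimate:
    """Running failure frequency estimate based on Eq. (1)."""
    incidents: int
    operating_hours: float

    def update(self, new_incidents: int, new_hours: float) -> None:
        """Pool newly reported incidents and operating hours with the history."""
        if new_incidents < 0 or new_hours < 0:
            raise ValueError("increments must be non-negative")
        self.incidents += new_incidents
        self.operating_hours += new_hours

    @property
    def per_hour(self) -> float:
        return self.incidents / self.operating_hours


# Start from one incident over the 937 years of operation reported in the text.
estimate = FrequencyEstimate(incidents=1, operating_hours=937 * HOURS_PER_YEAR)
before = estimate.per_hour

# Hypothetical new data: one more incident during one further year of operation.
estimate.update(new_incidents=1, new_hours=1 * HOURS_PER_YEAR)
after = estimate.per_hour

print(f"{before:.2E} -> {after:.2E}")
```

A vessel-specific tool might instead weight recent or own-vessel data more heavily, for example with a Bayesian update; that choice is left open here.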

It is further observed that in all cases, the automatic mode has a 
lower failure probability. This is mainly because human interaction in 
the manual mode is more considerable compared with that in the au- 
tomatic mode, and human error significantly affects the failure prob- 
ability of the system. As a result, the model suggests automatic mode to 
the decision support tool as it has a lower failure probability in most 
cases. In reality, however, other factors, such as cost and time, may 
have a considerable impact on the decision-making process. A decision 
support tool that simultaneously considers all factors should therefore 
be developed. 




5.3. Quality of input data 


The study and parts of the analyses considerably rely on the IMCA 
reports and available information. It is difficult, however, to verify the 
quality of incident information included in the IMCA reports because 
they are not in their original form. It is also possible that the original 
incident investigation is not sufficiently thorough to provide insights on 
the causal factors or is not performed by people with the appropriate 
competencies. The incident descriptions in the IMCA reports are typi- 
cally brief and insufficient in detail, thereby requiring interpretation 
and subjectivity. This is therefore among the limitations of data quality. 
Underreporting must also be assumed. The IMCA reports are based on
incidents that have been voluntarily shared by various vessel owners,
so a complete overview may not have been provided. The extent to which
incident details are missing from the IMCA reports can only be
surmised. It is generally
important to note that data quality can be compromised at any stage of 
the data process because of the following reasons. 


1 Underreporting 

2 Missing or incomplete data or errors in data collection and entry 

3 Differences in the application and comprehension of variable defi- 
nitions 


In this study, it is assumed that there is no bias in the available data 
obtained from IMCA. Datasets, however, can be assessed for levels of 
underreporting and data quality through comparison with other data- 
bases. A common comparison to make is between IMCA reports and 
available investigation reports. Another means is the utilization of the 
failure rate of components. Although these evaluations are extremely 
useful, it is impossible to determine the real frequencies and failure 
rates because the exact intersection of the two databases cannot be 
obtained [32]. 

The drift-off at the Skarv field is used to exemplify the model. The
incident report is in its original form, as provided by the vessel's
operator. The report, however, does not contain sufficient detail for a
proper human reliability analysis. The information is therefore used to
perform a coarse SPAR-H analysis, and some assumptions have to be made,
particularly for the hypothetical Scenario 2. The assumptions include
that sufficient time is available to the DPO in Scenario 2 and that the
procedures are supportive for the OIM in Scenario 1 but less helpful to
the DPO in Scenario 2. Moreover, the response of the DPO in Scenario 2
is assumed based on procedures. The assumptions are also presented in
Table 10. The analysis is also affected by hindsight bias, as are all
incident investigations.

The authors believe that despite the limitations in verifying the data 
quality from incident reports, the information utilized in this study 
provides a useful starting point for risk model development, which can 
be updated at a later stage when necessary. In addition, it is typically 
not possible to successfully collect data for every incident in a DP 
system, but not all incidents need to be reported to be able to draw 
conclusions and identify key priorities to improve DP safety [33], or 
make better decisions. 


5.4. Model uncertainties 


Non-observable data, such as behavioral or mental states (e.g., op- 
erator belief and desire) that influence human error, also exist. These 
types of data influence the risk level of the system. In this study, BN is 
employed to infer latent variables. The child nodes of BN that take input 
parameters such as stress level or fitness for duty of the operator could 
consider data uncertainty. In this study, it is assumed that all input data 
have a 10% uncertainty. 

It should be noted that the modeling results should be used in the 
decision-making process of the operator, and the level of uncertainty 




does not affect the comparison results between the two scenarios be- 
cause it similarly influences them. Simply out, the results can be un- 
certain, but the ratio of uncertainty for all scenarios remain the same 
when we compare manual and automatic modes. The level of un- 
certainty, therefore, does not change the category of the scenario's risk 
level. 
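
The argument above can be illustrated with a minimal sketch. It assumes
that each scenario's failure probability is the product of an initiator
probability and a recovery-failure probability, and that the 10%
uncertainty shifts every input by the same relative amount. The
structure, the names, and all numerical values are hypothetical and are
not taken from the study.

```python
# Hypothetical inputs; none of these values come from the study.
P_INITIATOR = 1.0e-3          # assumed probability of the initiating event
P_RECOVERY_FAILS = {          # assumed probability that recovery fails
    "manual": 5.0e-2,
    "automatic": 1.0e-2,
}


def scenario_failure_probability(mode: str, relative_error: float) -> float:
    """Failure probability of one scenario when every input is shifted
    by the same relative error (e.g. +0.10 for +10%)."""
    scale = 1.0 + relative_error
    return (P_INITIATOR * scale) * (P_RECOVERY_FAILS[mode] * scale)


def manual_to_automatic_ratio(relative_error: float) -> float:
    """Ratio of the manual to the automatic failure probability."""
    manual = scenario_failure_probability("manual", relative_error)
    automatic = scenario_failure_probability("automatic", relative_error)
    return manual / automatic


if __name__ == "__main__":
    for error in (-0.10, 0.0, 0.10):
        manual = scenario_failure_probability("manual", error)
        automatic = scenario_failure_probability("automatic", error)
        print(f"error={error:+.2f}  manual={manual:.3e}  "
              f"automatic={automatic:.3e}  ratio={manual / automatic:.3f}")
```

Under these assumptions the absolute probabilities change with the
error, while the ratio between the two modes does not. The invariance
relies on both scenarios sharing the same structure and the same
relative shift; inputs perturbed independently would, in general, also
change the ratio.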


5.5. Online and dynamic risk management models 


The proposed model can be used online because its input parameters
can be updated continuously. The input parameters include the ESD
initiator, the operator characteristics accepted by the BNs, and the
failure frequency of components. Because its response time is less
than 10 s, the model can serve as an online tool to facilitate the
decision-making process.

The model, however, is not dynamic. Dynamic risk methodologies 
are generally those that utilize a time-dependent phenomenological 
model of system evolution and consider stochastic behavior to estimate 
the risk associated with the system response to an initiating event. 
These methodologies also employ a predictive model that generates
branches at each user-specified time step or condition, together with
their associated probabilities, and computes the probability of each
scenario. This feature, which predicts system behavior for each
scenario, should be added to the model in future work.
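
The branching idea described above can be sketched as follows. This is
a deliberately simplified illustration, not the dynamic methodology
itself: it assumes two outcomes per time step with constant,
hypothetical probabilities, whereas a dynamic risk method would obtain
the branching conditions and probabilities from a time-dependent
phenomenological model. All names and values are assumptions.

```python
from itertools import product

# Hypothetical per-step branching probabilities; not taken from the study.
# At each time step the unit either holds position or drifts further.
BRANCHES = {"hold": 0.9, "drift": 0.1}


def enumerate_scenarios(n_steps: int) -> dict[tuple[str, ...], float]:
    """Return every branch sequence over n_steps with its probability."""
    scenarios = {}
    for path in product(BRANCHES, repeat=n_steps):
        probability = 1.0
        for outcome in path:
            probability *= BRANCHES[outcome]
        scenarios[path] = probability
    return scenarios


def probability_of_loss_of_position(n_steps: int, drift_limit: int) -> float:
    """Sum the probability of all scenarios with at least drift_limit drifts."""
    return sum(
        probability
        for path, probability in enumerate_scenarios(n_steps).items()
        if path.count("drift") >= drift_limit
    )


if __name__ == "__main__":
    for path, probability in enumerate_scenarios(3).items():
        print(" -> ".join(path), f"{probability:.4f}")
    print("P(loss of position):", probability_of_loss_of_position(3, 2))
```

Exhaustive enumeration grows as 2^n with the number of time steps, so
it is only practical for short horizons; it is used here because it
makes the probability of each scenario explicit.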


5.6. Model improvements and future work


5.6.1. Model improvement: decision support tool 

The information from the risk management model can provide input 
to the decision support tool. The aim of the decision support tool is to 
provide the DPO with an overview of the risks associated with relevant 
decision alternatives. As mentioned earlier, the main judgement that a 
DPO has to make is whether or not she/he can safely remain on location 
or needs to move off location. The risk management model proposed in 
this paper can provide input into this decision process by presenting the 
failure probability and uncertainties of the current operation, including 
the risk of potential future failures and different decision alternatives. 


5.6.2. Model improvement: decision alternatives 

At present, the model does not include information regarding the 
consequences of decision alternatives. In the case of DP drilling units, 
one of the main barriers and consequences of the loss of position is well 
disconnection. The primary problem in drilling operations is the 
maintenance of hydrocarbon containment. In order to prevent the loss 
of containment, the DPO can decide to use the functionality of manually 
disconnecting the unit. If the available time is not sufficient, then the 
automated emergency disconnection should be activated when the 
MODU exceeds the red limit in its loss of position. The risks associated 
with these two scenarios (manual vs. automated disconnection) and the 
potential success of the disconnection in maintaining hydrocarbon 
containment can considerably differ. A manual disconnection situation 
can allow the DPO to communicate with the driller on the imminent 
disconnection so that the driller can secure the equipment in a manner 
that would optimize the success of disconnection (e.g., the drill bit does 
not block the shearing rams in the BOP). Additionally, the strain on the 
riser will be higher during the automated emergency disconnect versus 
the manual disconnect, increasing the risk of tearing the riser. This risk 
remains even though the riser angle limit at which the automated dis- 
connect is set to be activated should protect the riser from tearing. 


5.6.3. Model improvement: environmental factors 

Environmental factors affect the risk associated with decision
alternatives and will be included in future work. The future
conditions of weather, waves, and currents have a significant effect
on the MODU's ability to maintain position and should be considered in
the risk associated with decision alternatives.


5.6.4. Model improvement: weighting of decision outcome parameters 

The risk management model in this study focuses on major hazard
risk. Other risks should also be considered, including time, material
damage, and loss of reputation, which may influence the
identification of the optimal decision alternative. The authors are
aware that the DP is simply a means for enabling complex operations.
As such, it is not the “objective.” The DP is necessary to enable
drilling in certain areas. Drilling windows and schedules are
disrupted by the choice to shift locations, and such decisions should
be seriously contemplated. The primary focus of risk models is safety
because an accident can be even more costly.

In future work, cost and time will be considered as decision outcome
parameters. The weighting of these parameters, as well as the major
hazard risk parameter, is a sensitive subject.
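
To illustrate why the weighting is sensitive, the following minimal
sketch ranks two decision alternatives with a weighted sum. The
weighted-sum form, the weights, and the normalized scores are all
hypothetical assumptions introduced for illustration; the study does
not prescribe a weighting scheme.

```python
# Hypothetical normalized scores per alternative (0 = best, 1 = worst).
ALTERNATIVES = {
    "remain_on_location": {"major_hazard_risk": 0.7, "cost": 0.1, "time": 0.1},
    "move_off_location": {"major_hazard_risk": 0.2, "cost": 0.6, "time": 0.6},
}

# Two hypothetical weightings of the decision outcome parameters.
SAFETY_WEIGHTS = {"major_hazard_risk": 0.6, "cost": 0.25, "time": 0.15}
COST_WEIGHTS = {"major_hazard_risk": 0.2, "cost": 0.4, "time": 0.4}


def weighted_score(scores: dict[str, float], weights: dict[str, float]) -> float:
    """Weighted average of the scores; a lower value is preferable."""
    total_weight = sum(weights.values())
    return sum(weights[name] * scores[name] for name in weights) / total_weight


def rank_alternatives(alternatives: dict[str, dict[str, float]],
                      weights: dict[str, float]) -> list[str]:
    """Return the alternative names ordered from best to worst."""
    return sorted(alternatives,
                  key=lambda name: weighted_score(alternatives[name], weights))


if __name__ == "__main__":
    print("safety-weighted:", rank_alternatives(ALTERNATIVES, SAFETY_WEIGHTS))
    print("cost-weighted:  ", rank_alternatives(ALTERNATIVES, COST_WEIGHTS))
```

With these illustrative numbers, the preferred alternative changes when
the weight shifts from major hazard risk toward cost and time, which is
the sensitivity referred to above.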


6. Conclusion 


This paper proposes a risk management framework for a dynamic 
positioning system to assist operators in decision-making. The output of 
the risk management framework provides operators with a real-time 
risk status that can aid in making better decisions within a limited time. 
The framework presented in this paper also takes a more holistic ap- 
proach to the risk modeling of DP operations by including human error 
scenarios both as initiating events and potentially escalating events. 

The paper proposes a general risk management model for DP class 2 
that can be used in a decision support tool. The developed model is 
based on 15 years of historical incident data of DP-related accidents and 
incidents. Moreover, human and organizational factors are considered
in the risk management model using a Bayesian network model, which is
trained based on the SPAR-H method.

The proposed modelling approach is generic, and in this paper, it is
applied to a DP drilling unit as a case study. The frequencies and
probabilities of initial events of the DP drilling unit are first
determined by cleaning the data gathered from the IMCA reports. Human
errors are then quantified using the SPAR-H method, based on relevant
PSFs. Finally, to evaluate the effectiveness of the framework, two
scenarios in automatic and manual modes are compared.

The results show that the SPAR-H and Bayesian network approaches 
are potential methods for considering the human and organizational 
factors in risk management. Moreover, the comparison results between 
the manual and automatic modes show that the proposed risk man- 
agement model can be an appropriate tool for risk-informed decision- 
making. In this study, the decision-making process is based on system 
failure probability. Other parameters, such as cost and time limitation, 
could be implemented in the developed framework. 


Declaration of Competing Interest 


The authors declare that they have no known competing financial 
interests or personal relationships that could have appeared to influ- 
ence the work reported in this paper. 



CRediT authorship contribution statement 


Tarannom Parhizkar: Data curation, Writing - original draft.
Sandra Hogenboom: Data curation, Investigation, Writing - review &
editing. Jan Erik Vinnem: Conceptualization, Methodology, Writing -
review & editing, Supervision. Ingrid Bouwer Utne: Conceptualization,
Methodology, Writing - review & editing, Supervision.




References 


[1] Volkanovski A, Cepin M. Implication of PSA uncertainties on risk-informed decision making. Nucl Eng Des 2011;241(4):1108-13.
[2] Mehr AF, Irem TY. Risk-based decision-making for managing resources during the design of complex space. J Mech Des 2006;128(4):1014-22.
[3] The International Marine Contractors Association. Dynamic positioning station keeping review. Incidents and events reported for 2016. International Marine Contractors Association, 2018.
[4] Parhizkar T, Aramoun F, Esbati S, Saboohi SY. Efficient performance monitoring of building central heating system using Bayesian Network method. J Build Eng 2019;26:100835.
[5] Utne IB, Schjølberg I, Roe E. High reliability management and control operator risks in autonomous marine systems and operations. Ocean Eng 2019;171:399-416.
[6] Parhizkar T, Balali S, Mosleh A. An entropy based bayesian network framework for system health monitoring. Entropy 2018;20(6):416.
[7] Vinnem JE, Utne IB, Schjølberg I. On the need for online decision support in FPSO-shuttle tanker collision risk reduction. Ocean Eng 2015;101:109-17.
[8] NASA N. Risk-informed decision making handbook. Washington, DC: NASA Headquarters; 2010.
[9] Liu Y, Frangopol DM, Cheng M. Risk-informed structural repair decision making for service life extension of aging naval ships. Mar Struct 2019;64:305-21.
[10] Shah LA, Etienne A, Siadat A, Vernadat FB. Performance visualization in industrial performance visualization in industrial. IFAC-PapersOnLine 2018;51(11):552-7.
[11] Kongsvik T, Almklov P, Haavik T, Haugen S, Vinnem JE, Schiefloe PM. Decisions and decision support for major accident prevention in the process industries. J Loss Prev Process Ind 2015;35:85-94.
[12] Yang X, Haugen S. Risk information for operational decision-making in the offshore oil and gas industry. Saf Sci 2016;86:98-109.
[13] Rokseth B, Utne IB, Vinnem JE. A systems approach to risk analysis of maritime operations. 2017;231(1):53-68.
[14] Bø TI, Johansen TA, Sorensen AJ, Mathiesen E. Dynamic consequence analysis of marine electric power plant in dynamic positioning. Appl Ocean Res 2016;57:30-9.
[15] Khorasani V. Risk assessment of diesel engine failure in a dynamic positioning system. Master Thesis. Faculty of Science and Technology, University of Stavanger; 2015.
[16] Vedachalam N, Ramadass GA. Reliability assessment of multi-megawatt capacity offshore dynamic positioning systems. Phys Procedia 2017;63:251-61.
[17] Hogenboom S, Rokseth B, Vinnem JE, Utne IB. Human reliability and the impact of control function allocation in the design of dynamic positioning systems. Reliab Eng Syst Saf 2020;194:106340.
[18] Parhizkar T, Aramoun F, Saboohi Y. Efficient health monitoring of buildings using failure modes and effects analysis case study: air handling unit system. J Build Eng 2020;29:101113.
[19] NASA N. Probabilistic Risk Assessment Procedures Guide for Offshore Applications (DRAFT) report. 2017. https://www.bsee.gov/sites/bsee.gov/files/ProbalisticRiskAssessment%20%28PRA%29/bsee_pra_procedures guide -_10-26-17.pdf.
[20] www.dynamicpositioning.guru.com. 2015. [Online].
[21] Pil I. Causes of dynamic positioning system failures and their effect on dp vessel. Master Thesis. Estonian Maritime Academy, Tallinn University of Technology; 2018.
[22] Wells G. Hazard identification and risk assessment. IChemE; 1996.
[23] The International Marine Contractors Association. Dynamic positioning station keeping review, incidents and events reported for (2004-2015). International Marine Contractors Association, 2005-2017.
[24] Stamatis DH. Failure mode and effect analysis: FMEA from theory to execution. ASQ Quality press; 2003.
[25] Reason J. Managing the risks of organizational accidents. Routledge; 2016.
[26] Rasmussen J. Human errors. A taxonomy for describing human malfunction in industrial installations. J Occup Accid 1982;4(2-4):311-33.
[27] Reason J. Human error. New York: Cambridge University Press; 1990.
[28] Groth K, Wang C, Mosleh A. Hybrid causal methodology and software platform for probabilistic risk assessment and safety monitoring of socio-technical systems. Reliab Eng Syst Saf 2010;95(12):1276-85.
[29] Hakonsund K. Investigation report level 2, Rig excursion reached red watch circle. Odfjell Drilling, Synergi 2018;209204:1-19.
[30] Chen H, Nygard B. Quantified risk analysis of DP operations-principles and challenges. SPE International Conference and Exhibition on Health, Safety, Security, Environment, and Social Responsibility. Society of Petroleum Engineers; 2016.
[31] Groth KM, Swiler LP. Use of a SPAR-H Bayesian network for predicting human error probabilities with missing observations. 11th international probabilistic safety assessment and management conference. 2012.
[32] Woodward R. The organisation for economic co-operation and development (OECD). Routledge; 2009.
[33] Thompson S. Safety Data Management and Governance, Developing and Implementing Safety Data Business Plans report. US Department of Transportation, Federal Highway Administration; 2016. https://safety.fhwa.dot.gov/rsdp/manage.aspx.


